Method for implementing a multiplier-less FIR filter

ABSTRACT

A method for implementing a multiplier-less FIR filter is disclosed, in which the FIR filter performs the convolution of ${{H(z)} = {\sum\limits_{k = 0}^{N - 1}{h_{k}z^{- k}}}},$

where h_(k) is the k-th coefficient of the FIR filter. The dynamic range of the filter coefficients is compressed by a transformation: ${{H^{\prime}(z)} = {{H(z)}\frac{\prod\limits_{i = 1}^{m - 1}\left( {1 + {\alpha_{i}z^{- \beta}}} \right)^{m}}{\prod\limits_{i = 1}^{m - 1}\left( {1 + {\alpha_{i}z^{- \beta}}} \right)^{m}}}},$

where parameters α and β are chosen depending on filter type, −1≦α_(i)≦1, and m denotes the number of iterations of the coefficient operation of the transformation, so as to avoid the serious quantization error caused by the non-uniform distribution of SPT numbers. Then, the compressed coefficients are quantized into SPT numbers, and the coefficients are optimized by removing redundant SPT numbers.

BACKGROUND OF THE INVENTION

[0001] 1. Field of the Invention

[0002] The present invention relates to the technical field of finite impulse response (FIR) filters and, more particularly, to a method for implementing a multiplier-less FIR filter based on a modified DECOR transformation and a trellis de-allocation scheme.

[0003] 2. Description of Related Art

[0004] Conventionally, the finite impulse response (FIR) filter is one of the key functional blocks in many digital signal processing (DSP) applications. In time domain representation, an N-tap FIR filter performs the following convolution: $\begin{matrix} {{{y\lbrack n\rbrack} = {\sum\limits_{k = 0}^{N - 1}\quad {h_{k}{x\left\lbrack {n - k} \right\rbrack}}}},} & (1) \end{matrix}$

[0005] where h_(k) is the k-th coefficient of the FIR filter, and x[n] and y[n] denote the input and the output signals at time instance n, respectively. Such an FIR filter architecture is shown in FIG. 1. As shown, multiplication is the fundamental operation in the FIR filter structure, which may cause a severe problem because an array multiplier consumes a large layout area and considerable power.

[0006] To implement a multiplier-less FIR filter, the signed power-of-two (SPT) number is employed to express the coefficient of the FIR filter as follows: ${h_{i} \cong {\sum\limits_{k = 1}^{L}\quad {S_{k}2^{- k}}}} = {D_{n - 1}\cdots \quad D_{1}D_{0}}$

[0007] where S_(k)∈{−1,0,1}, L is the number of nonzero digits, and n is the coefficient wordlength. That is, a coefficient is approximated by the sum of several nonzero signed digits, leaving a quantization residue. One multiplier can therefore be replaced by several adders performing shift-and-add operations.
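By way of a non-limiting illustration, the following Python sketch shows one possible way to quantize a coefficient into a limited number of SPT terms. It is a simple greedy search and is not stated in this disclosure to be the quantizer actually used; the function name quantize_spt and its parameters are hypothetical. Each term is reported as a pair (sign, k), meaning sign·2^(−k) with 1≦k≦wordlength.

```python
def quantize_spt(value, max_terms, wordlength):
    """Greedily approximate `value` by at most `max_terms` signed powers of two.

    Returns (terms, residual). Each term is (sign, k), meaning sign * 2**-k
    with 1 <= k <= wordlength; residual = value - sum of the chosen terms.
    Assumption: a plain greedy search, used here only for illustration.
    """
    terms = []
    residual = value
    used = set()  # each digit position may hold only one nonzero digit
    for _ in range(max_terms):
        candidates = [k for k in range(1, wordlength + 1) if k not in used]
        if residual == 0 or not candidates:
            break
        # Pick the power of two closest in magnitude to the remaining error.
        k = min(candidates, key=lambda c: abs(abs(residual) - 2.0 ** -c))
        if abs(abs(residual) - 2.0 ** -k) >= abs(residual):
            break  # another term would not reduce the error
        sign = 1 if residual > 0 else -1
        terms.append((sign, k))
        used.add(k)
        residual -= sign * 2.0 ** -k
    return terms, residual


# A small coefficient is represented exactly with two nonzero digits ...
small_terms, small_residual = quantize_spt(-0.078125, max_terms=2, wordlength=8)
# ... whereas a large coefficient leaves a sizeable residue with the same budget.
large_terms, large_residual = quantize_spt(0.9, max_terms=2, wordlength=9)
print(small_terms, small_residual)
print(large_terms, large_residual)
```

In this sketch the small coefficient is reproduced without residue, whereas the coefficient 0.9 cannot be approximated well with two nonzero digits regardless of wordlength.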

[0008] In such a multiplier-less FIR filter implementation, the number of nonzero digits employed determines the cost of the FIR filter. Therefore, the increase of the number of nonzero digits will increase the hardware complexity. To avoid employing complicated hardware, it is desired to control the number of nonzero digits to be smaller than a predetermined value.

[0009] However, FIG. 2 shows the distribution of 2-nonzero-digit SPT numbers between 0.0 and 1.0 for wordlengths of 7, 8, and 9, respectively. As can be seen, the gaps in the SPT distribution above 0.5 cannot be reduced even if the wordlength of the SPT numbers increases. To achieve higher precision (that is, to reduce the gaps), more nonzero digits must be employed. However, increasing the number of nonzero digits increases the number of adders in each filter tap.
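The behavior described for FIG. 2 can be checked numerically with the following Python sketch. It assumes that the digit weights are 2^(−1) through 2^(−wordlength) and that at most two digits are nonzero; the function names spt_values and largest_gap are hypothetical and are introduced only for this illustration.

```python
from itertools import combinations, product


def spt_values(wordlength, max_nonzero=2):
    """Sorted non-negative values representable with at most `max_nonzero`
    signed digits of weight 2**-1 ... 2**-wordlength (an assumed digit layout)."""
    weights = [2.0 ** -k for k in range(1, wordlength + 1)]
    values = {0.0}
    for count in range(1, max_nonzero + 1):
        for chosen in combinations(weights, count):
            for signs in product((-1, 1), repeat=count):
                v = sum(s * w for s, w in zip(signs, chosen))
                if v >= 0:
                    values.add(v)
    return sorted(values)


def largest_gap(values, low, high):
    """Largest distance between neighboring representable values in [low, high]."""
    inside = [v for v in values if low <= v <= high]
    return max(b - a for a, b in zip(inside, inside[1:]))


for wordlength in (7, 8, 9):
    values = spt_values(wordlength)
    print(wordlength, len(values),
          largest_gap(values, 0.0, 0.5),   # gap among the smaller values
          largest_gap(values, 0.5, 1.0))   # gap among the larger values
```

Under these assumptions, more values become representable as the wordlength grows, but the largest gap above 0.5 stays the same and remains larger than the largest gap below 0.5.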

[0010] In the known 1st-order differential coefficient method (DCM), equation (1) can be reformulated as follows: $\begin{matrix} \begin{matrix} {{y\lbrack n\rbrack} = {\sum\limits_{k = 0}^{N - 1}{\left( \quad {h_{k} - h_{k - 1} + h_{k - 1}} \right){x\left\lbrack {n - k} \right\rbrack}}}} \\ {= {{h_{0}{x\lbrack n\rbrack}} + {\sum\limits_{k = 1}^{N - 1}{\delta_{k}^{1}{x\left\lbrack {n - k} \right\rbrack}}} + {y\left\lbrack {n - 1} \right\rbrack} - {h_{N - 1}{{x\left\lbrack {n - N} \right\rbrack}.}}}} \end{matrix} & (2) \end{matrix}$

[0011] where δ¹ _(k)≡h_(k)−h_(k−1). The term “1st-order” denotes that the difference between contiguous coefficients is taken only once. The corresponding DCM-based FIR structure is depicted in FIG. 3, wherein the extra cost of the 1st-order DCM is one additional tap and one accumulator, as circled by the dotted line. For the m-th-order DCM, the coefficients are generated by taking the difference of the (m−1)th-order DCM coefficients as

δ^(m) _(k−m/k)≡δ^(m−1) _(k−m+1/k)−δ^(m−1) _(k−m/k−1).   (3)

[0012] The DCM significantly reduces the dynamic range of the filter coefficients, as shown in FIG. 4. The reduced dynamic range of the FIR filter coefficients implies that the wordlength can be reduced.
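As a non-limiting numerical check of Eq. (2), the following Python sketch compares the direct form of Eq. (1) with the first-order DCM recursion. The coefficient values and input samples are hypothetical, zero initial conditions are assumed, and the function names fir_direct and fir_dcm1 are introduced only for this illustration.

```python
def fir_direct(h, x):
    """Direct form of Eq. (1): y[n] = sum_k h[k] * x[n - k]."""
    return [sum(hk * x[n - k] for k, hk in enumerate(h) if n - k >= 0)
            for n in range(len(x))]


def fir_dcm1(h, x):
    """First-order DCM form of Eq. (2), assuming zero initial conditions."""
    n_taps = len(h)
    # delta[k - 1] holds h[k] - h[k - 1] for k = 1 .. N - 1.
    delta = [h[k] - h[k - 1] for k in range(1, n_taps)]

    def sample(i):
        return x[i] if i >= 0 else 0

    y = []
    for n in range(len(x)):
        acc = h[0] * sample(n)
        acc += sum(d * sample(n - k) for k, d in enumerate(delta, start=1))
        acc += y[n - 1] if n >= 1 else 0    # accumulator term y[n - 1]
        acc -= h[-1] * sample(n - n_taps)   # additional tap h[N-1] * x[n - N]
        y.append(acc)
    return y, delta


h = [c / 32 for c in (1, 3, 5, 6, 5, 3, 1)]   # hypothetical low-pass-like taps
x = [4, -2, 7, 0, 1, 3, -5, 2, 6, -1]         # hypothetical input samples
y_dcm, delta = fir_dcm1(h, x)
print(max(abs(c) for c in h), max(abs(d) for d in delta))
```

For these example taps the two forms produce the same output, while the largest difference coefficient is smaller in magnitude than the largest original coefficient.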

[0013] An alternative representation of the FIR filter in the z-domain is given in Eq. (4), where H(z) is called the transfer function of the filter. $\begin{matrix} {{H(z)} = {\sum\limits_{k = 0}^{N - 1}\quad {h_{k}{z^{- k}.}}}} & (4) \end{matrix}$

[0014] From the z-domain point of view, the first-order DCM of the FIR filter can be represented as $\begin{matrix} {{{H^{\prime}(z)} = \frac{{H(z)}\left( {1 - z^{- 1}} \right)}{\left( {1 - z^{- 1}} \right)}},} & (5) \end{matrix}$

[0015] where H(z) is the transfer function of the original FIR filter. It can be seen that the transfer function is unchanged as long as the introduced term, (1−z⁻¹), is fully cancelled between the numerator and the denominator of Eq. (5). Furthermore, this transformation is equivalent to inserting a pole-zero pair on the real axis of the z-plane at z=1. For the m-th order DCM, m pole-zero pairs are located at the same position. In addition, Eq. (5) can be generalized to the DECOR transformation, in which the transfer function is rewritten as $\begin{matrix} {{H^{\prime}(z)} = {{H(z)}{\frac{\left( {1 + {\alpha \quad z^{- \beta}}} \right)^{m}}{\left( {1 + {\alpha \quad z^{- \beta}}} \right)^{m}}.}}} & (6) \end{matrix}$

[0016] The parameters α and β are chosen depending on the filter type, and m denotes the order (the number of iterations of the coefficient operation) of DECOR, as listed in Table 1. As is known, the DECOR transformation can reduce the coefficient dynamic range in all kinds of FIR filters.

TABLE 1
Filter Type                            α     β          F(z)
Low-pass                               −1    1          (1 − z⁻¹)^(m)
High-pass                              1     1          (1 + z⁻¹)^(m)
Band-pass (ω_(c): center frequency)    1     π/ω_(c)    (1 + z^(−π/ω_(c)))^(m)
Band-stop                              −1    2          (1 − z⁻²)^(m)
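For illustration only, the numerator of Eq. (6) can be formed by repeated polynomial multiplication, as in the following Python sketch. The sketch assumes that β is a positive integer number of sample delays, and the coefficient values are hypothetical; the function names polymul and decor_numerator are introduced only for this example.

```python
def polymul(a, b):
    """Multiply two polynomials in z**-1 given as coefficient lists."""
    out = [0.0] * (len(a) + len(b) - 1)
    for i, ai in enumerate(a):
        for j, bj in enumerate(b):
            out[i + j] += ai * bj
    return out


def decor_numerator(h, alpha, beta, m):
    """Coefficients of H(z) * (1 + alpha * z**-beta)**m, the numerator of Eq. (6).

    Assumption: beta is a positive integer delay.
    """
    factor = [1.0] + [0.0] * (beta - 1) + [float(alpha)]
    out = list(h)
    for _ in range(m):
        out = polymul(out, factor)
    return out


h = [c / 32 for c in (1, 3, 5, 6, 5, 3, 1)]            # hypothetical low-pass-like taps
first = decor_numerator(h, alpha=-1, beta=1, m=1)      # low-pass row of Table 1, m = 1
second = decor_numerator(h, alpha=-1, beta=1, m=2)     # same row, m = 2
print(max(abs(c) for c in h),
      max(abs(c) for c in first),
      max(abs(c) for c in second))
```

For these example taps the transformed coefficients have a smaller peak magnitude than the original ones, at the cost of one additional tap per order.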

[0017] In the design of an FIR filter based on SPT terms, quantizing the coefficients after the DECOR transformation encounters a stability problem. Therefore, quantization must be performed before the DECOR transformation.

[0018] This procedure ensures that a DECOR-based FIR filter can be implemented by shift-and-add operations instead of array multipliers. However, it confronts a serious coefficient quantization problem, since the filter coefficients may exceed 0.5. Quantizing these coefficients and then transforming them into a smaller dynamic range only helps reduce the wordlength, not the quantization error. As a result, DCM- and DECOR-based FIR filters still need more nonzero terms to prevent a serious quantization problem when quantizing larger coefficients. Therefore, it is desired for the discussed FIR filter to be improved, so as to mitigate and/or obviate the aforementioned problem.

SUMMARY OF THE INVENTION

[0019] It is one object of the present invention to provide a method for implementing a multiplier-less FIR filter which can avoid the serious quantization error caused by non-uniform distribution of SPT numbers.

[0020] As can be seen in FIG. 2, most of the values that can be represented with a limited number of nonzero terms are crowded in the region of smaller values. Serious coefficient quantization error can be avoided if all of the filter coefficients to be quantized lie in this region. Consequently, the dynamic range of the filter coefficients is compressed before SPT quantization so as to avoid the serious quantization problem. This is the main concept of the invention in providing a method for implementing a lower-cost multiplier-less FIR filter, since no additional adders are needed to compensate for serious quantization error.

[0021] To achieve the objects, there is provided a method for implementing a multiplier-less FIR filter that performs the convolution of ${{H(z)} = {\sum\limits_{k = 0}^{N - 1}\quad {h_{k}z^{- k}}}},$

[0022] where h_(k) is the k-th coefficient of the FIR filter. The method comprises the steps of: (A) compressing the dynamic range of filter coefficients by a transformation: ${{H^{\prime}(z)} = {{H(z)}\frac{\prod\limits_{i = 1}^{m - 1}\quad \left( {1 + {\alpha_{i}z^{- \beta}}} \right)^{m}}{\prod\limits_{i = 1}^{m - 1}\quad \left( {1 + {\alpha_{i}z^{- \beta}}} \right)^{m}}}},$

[0023] where parameters α and β are chosen depending on filter type, −1≦α_(i)≦1, and m denotes the number of iterations of the coefficient operation of the transformation; and (B) quantizing the compressed coefficients obtained in step (A) into SPT numbers.

[0024] The coefficients generated by step (A) and step (B) could be a combination of hundreds of SPT terms. Some of these SPT terms are redundant, meaning that removing them will not seriously affect filter performance. Hence, another object of this invention is to further optimize the filter coefficients so as to achieve the lowest cost. The proposed optimization procedure provides an efficient way of choosing the SPT terms that can be dropped with little performance degradation.

[0025] Other objects, advantages, and novel features of the invention will become more apparent from the following detailed description when taken in conjunction with the accompanying drawings.

BRIEF DESCRIPTION OF THE DRAWINGS

[0026] FIG. 1 shows a conventional FIR filter architecture;

[0027] FIG. 2 shows the distribution of SPT numbers;

[0028] FIG. 3 shows a DCM-based FIR structure;

[0029] FIG. 4 shows the effectiveness of the DCM in significantly reducing the dynamic range of filter coefficient;

[0030] FIG. 5 shows a flowchart of the method for implementing a multiplier-less FIR filter in accordance with the present invention;

[0031] FIG. 6 illustrates that different values of α_(i) are assigned to optimize filter performance in the m-th order MDECOR transformation;

[0032] FIG. 7 schematically illustrates the initialization and SPT term removing steps in the trellis de-allocation; and

[0033] FIG. 8 schematically illustrates the de-accumulation and determination steps in the trellis de-allocation.

DETAILED DESCRIPTION OF THE PREFERRED EMBODIMENT

[0034] With reference to FIG. 5, there is shown a preferred embodiment of the method for implementing multiplier-less FIR filters, which includes the steps of: (Step 1) compressing the dynamic range of filter coefficients into a smaller set; (Step 2) quantizing these pre-processed coefficients, which are generated by step 1, into SPT numbers; and (Step 3) optimizing the coefficients by removing redundant SPT numbers.

[0035] In step 1, a Modified DECOR (MDECOR) transformation is employed to compress the dynamic range of the filter coefficients. The transfer function of the m-th order MDECOR is represented as $\begin{matrix} {{H^{\prime}(z)} = {{H(z)}{\frac{\prod\limits_{i = 1}^{m - 1}\left( {1 + {\alpha_{i}z^{- \beta}}} \right)^{m}}{\prod\limits_{i = 1}^{m - 1}\left( {1 + {\alpha_{i}z^{- \beta}}} \right)^{m}}.}}} & (7) \end{matrix}$

[0036] where parameter α_(i) is subject to the constraint −1≦α_(i)≦1. Note that the value of parameter β and the sign of α_(i) are still determined by the filter type, as listed in Table 1. However, α_(i) can be chosen as an arbitrary value with an absolute value less than or equal to 1.

[0037] With this MDECOR transformation, the poles in Eq. (7) will not cause serious distortion even if the zeros drift away from their original locations. Furthermore, different values of α_(i) can be assigned to optimize filter performance in the m-th order MDECOR transformation, as shown in FIG. 6.

[0038] This property provides high flexibility in coefficient compression, so that a smaller dynamic range of the filter coefficients can be expected. As mentioned above, the quantization error is eased if the coefficients to be quantized have smaller values. MDECOR therefore suffers less error, owing to the strong compression made possible by choosing various values of α_(i). As a result, the MDECOR-based FIR filter can be realized with lower hardware complexity, since no additional nonzero terms are needed to solve the serious quantization problem. Hence, a better frequency response and Signal-to-Quantization-Noise Ratio (SQNR) performance can be expected. Suppose that it is desired to implement an FIR filter with SPT multipliers, but the performance cannot satisfy the specification under a limited number of nonzero digits. The present MDECOR can be employed to achieve the target precision, instead of using more adders in the practical implementation. Hence, a lower cost can be achieved.
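The following Python sketch illustrates, under simplifying assumptions, how a cascade in the spirit of Eq. (7) can compress the coefficients while leaving the overall response unchanged. It assumes one first-order factor (1+α_(i)z^(−β)) per chosen α_(i), an integer β, unquantized coefficients, and zero initial conditions; the values of α_(i) and the taps are hypothetical, as are the function names.

```python
def fir(coeffs, x):
    """Plain FIR convolution with zero initial conditions."""
    return [sum(c * x[n - k] for k, c in enumerate(coeffs) if n - k >= 0)
            for n in range(len(x))]


def mdecor_numerator(h, alphas, beta=1):
    """Multiply H(z) by one factor (1 + alpha_i * z**-beta) per alpha_i (assumed form)."""
    out = list(h)
    for alpha in alphas:
        factor = [1.0] + [0.0] * (beta - 1) + [alpha]
        new = [0.0] * (len(out) + len(factor) - 1)
        for i, a in enumerate(out):
            for j, b in enumerate(factor):
                new[i + j] += a * b
        out = new
    return out


def recursive_sections(v, alphas, beta=1):
    """Apply 1 / (1 + alpha_i * z**-beta) per alpha_i: y[n] = v[n] - alpha_i * y[n - beta]."""
    for alpha in alphas:
        y = []
        for n, vn in enumerate(v):
            y.append(vn - alpha * (y[n - beta] if n >= beta else 0.0))
        v = y
    return v


h = [c / 32 for c in (1, 3, 5, 6, 5, 3, 1)]   # hypothetical low-pass-like taps
alphas = [-0.875, -0.75]                      # low-pass: negative sign, |alpha_i| <= 1
compressed = mdecor_numerator(h, alphas)
impulse = [1.0] + [0.0] * 11
restored = recursive_sections(fir(compressed, impulse), alphas)
print(max(abs(c) for c in h), max(abs(c) for c in compressed))
```

In this example the compressed coefficients have a smaller peak magnitude than the original taps, and the recursive sections restore the original impulse response. Once the compressed coefficients are quantized, the cancellation is only approximate, which is why the choice of α_(i) inside the unit circle matters.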

[0039] After performing step 1, the multiplier-less FIR filter is provided with compressed coefficients, denoted as h(1), h(2), h(3), . . . h(n). In step 2, these pre-processed coefficients are quantized into SPT numbers. Assuming that there are four compressed coefficients h(1), h(2), h(3), and h(4), the following quantization is given as an example:

h(1)=−2⁻⁴,

h(2)=−2⁻⁴−2⁻⁶,

h(3)=−2⁻⁴−2⁻⁸, and

h(4)=2⁻⁷.
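To illustrate how such quantized coefficients lead to a multiplier-less realization, the following Python sketch applies the four example coefficients using only shifts, additions, and subtractions. Integer input samples and truncating arithmetic right shifts are assumed, h(1) is assumed to be the first tap, and the names H_TERMS, spt_multiply, and spt_fir are hypothetical.

```python
# Each coefficient is a list of (sign, shift) pairs, meaning sign * 2**-shift.
H_TERMS = [
    [(-1, 4)],            # h(1) = -2^-4
    [(-1, 4), (-1, 6)],   # h(2) = -2^-4 - 2^-6
    [(-1, 4), (-1, 8)],   # h(3) = -2^-4 - 2^-8
    [(+1, 7)],            # h(4) = 2^-7
]


def spt_multiply(sample, terms):
    """Multiply an integer sample by an SPT coefficient with shifts and adds only."""
    acc = 0
    for sign, shift in terms:
        shifted = sample >> shift  # arithmetic right shift: scale by 2**-shift, floored
        acc = acc + shifted if sign > 0 else acc - shifted
    return acc


def spt_fir(x, taps):
    """FIR filter whose taps are SPT coefficients; no multiplier is used."""
    y = []
    for n in range(len(x)):
        acc = 0
        for k, terms in enumerate(taps):
            if n - k >= 0:
                acc += spt_multiply(x[n - k], terms)
        y.append(acc)
    return y


# An impulse of height 256 makes every shift exact for these coefficients.
print(spt_fir([256, 0, 0, 0, 0], H_TERMS))
```

Each nonzero SPT term costs one shift and one addition or subtraction, so the six terms of this example correspond to six add/subtract operations across the four taps.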

[0040] After performing step 2, the filter coefficients have been transformed by the MDECOR algorithm and quantized into SPT numbers, whereby a multiplier-less filter can be realized to meet the filter specification. However, the processed (that is, transformed and quantized) coefficients may contain some redundant SPT terms, since the filter specification is not considered in the aforementioned steps. As a consequence, some SPT terms in the processed coefficients can be removed without seriously degrading the frequency response, while still satisfying the filter specification assigned by the filter designer.

[0041] In step 3, a trellis de-allocation procedure is employed to optimize the coefficients. The trellis de-allocation is a modified version of the known trellis allocation algorithm. The trellis allocation algorithm adds successive SPT terms to the filter, one at a time, each time choosing the term that benefits the filter's frequency response most, and terminates when the best filter encountered meets the design specification. Like the trellis allocation algorithm, the trellis de-allocation algorithm selects, in each stage, the SPT term that benefits the filter's frequency response most. However, the trellis de-allocation removes the selected SPT term from the previous stage instead of adding it. Such a trellis de-allocation includes the following steps:

[0042] (Step 3-1) Initialization

[0043] In the first stage of trellis de-allocation, all SPT terms of the coefficients are allocated. As shown in FIG. 7, the allocated SPT terms are: h(1)=−2⁻⁴, h(2)=−2⁻⁴−2⁻⁶, h(3)=−2⁻⁴−2⁻⁸, and h(4)=2⁻⁷. In the k=K stage, all SPT terms, q_(−4,1), q_(−4,2), q_(−6,2), q_(−4,3), q_(−8,3) and q_(7,4) are listed as candidates to be removed, wherein q_(x,y) denotes that there is a 2^(−|x|) in the y-th coefficient, and the sign of x denotes the addition/subtraction of the operation.

[0044] (Step 3-2) De-Accumulation

[0045] Each SPT term in the trellis de-allocation can be a candidate to be removed. Therefore, as shown in FIG. 8, in the k=K−1 stage, all SPT terms, q_(−4,1), q_(−4,2), q_(−6,2), q_(−4,3), q_(−8,3) and q_(7,4) are listed as candidates to be removed. Each SPT term in the (K−1)-th stage is linked to all other SPT terms in the K-th stage. Each link corresponds to an operation that determines the frequency response when the linked SPT term in the (K−1)-th stage is removed. This process is repeated until a stage is reached in which none of the candidates satisfies the filter specifications.

[0046] (Step 3-3) Determination

[0047] For each SPT term in each stage, the link that results in the best frequency response with the lowest cost is selected, and the links that satisfy the filter specification are retained as the surviving path, as shown in FIG. 8, thereby determining the redundant SPT terms.
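The removal of redundant SPT terms can be sketched in Python as follows. This is a deliberately simplified, single-path (greedy) variant and not the full trellis with multiple surviving paths: in each stage it tries removing every remaining term, keeps the removal that disturbs the frequency response least, and stops when no removal stays within the specification. The specification is modeled here by a hypothetical tolerance on the deviation from the unpruned response, and all function names are introduced only for this example.

```python
import cmath
import math


def coeff_values(terms_per_tap):
    """Convert per-tap lists of (sign, shift) terms into coefficient values."""
    return [sum(sign * 2.0 ** -shift for sign, shift in terms)
            for terms in terms_per_tap]


def freq_response(h, num_points=64):
    """Samples of H(e^jw) on a uniform grid from w = 0 to w = pi."""
    grid = (math.pi * i / (num_points - 1) for i in range(num_points))
    return [sum(hk * cmath.exp(-1j * w * k) for k, hk in enumerate(h)) for w in grid]


def max_deviation(h, reference):
    return max(abs(a - b) for a, b in zip(freq_response(h), freq_response(reference)))


def greedy_deallocate(terms_per_tap, reference, tolerance):
    """Remove SPT terms one per stage while the deviation stays within tolerance."""
    current = [list(t) for t in terms_per_tap]
    removed = []
    while True:
        best = None
        for tap, terms in enumerate(current):
            for idx in range(len(terms)):
                trial = [list(t) for t in current]
                term = trial[tap].pop(idx)
                err = max_deviation(coeff_values(trial), reference)
                if err <= tolerance and (best is None or err < best[0]):
                    best = (err, tap, idx, term)
        if best is None:          # no candidate satisfies the (assumed) specification
            return current, removed
        _, tap, idx, term = best
        current[tap].pop(idx)
        removed.append((tap, term))


# The four example coefficients h(1)..h(4), stored as taps 0..3.
terms = [[(-1, 4)], [(-1, 4), (-1, 6)], [(-1, 4), (-1, 8)], [(1, 7)]]
reference = coeff_values(terms)
kept, removed = greedy_deallocate(terms, reference, tolerance=0.005)
print(kept, removed)
```

With the hypothetical tolerance of 0.005, only the 2^(−8) term of h(3) is found to be removable in this sketch; a looser tolerance would allow further stages.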

[0048] In view of the foregoing, it is known that the present invention utilizes the MDECOR transformation to reduce the serious quantization error caused by non-uniform distribution of SPT numbers. The MDECOR compresses the magnitude of coefficients before quantizing them. Also, the present invention optimizes these coefficients by trellis de-allocation algorithm. Therefore, if one intends to implement a FIR filter by SPT multipliers but the performance cannot satisfy the specification under limited numbers of non-zero digits, it is applicable to employ the MDECOR to achieve the target precision performance, instead of using more adders in practical implementation, and then apply trellis de-allocation algorithm. Accordingly, a lower cost can be achieved in designing the multiplier-less FIR filters.

[0049] Although the present invention has been explained in relation to its preferred embodiment, it is to be understood that many other possible modifications and variations can be made without departing from the spirit and scope of the invention as hereinafter claimed. 

What is claimed is:
 1. A method for implementing a multiplier-less FIR filter that performs the convolution of ${{H(z)} = {\sum\limits_{k = 0}^{N - 1}{h_{k}z^{- k}}}},$

where h_(k) is the k-th coefficient of the FIR filter, and x(n) and y(n) denote the input and the output signals at time instance n, respectively, the method comprising the steps of: (A) compressing the coefficients of the FIR filter; and (B) quantizing the compressed coefficients obtained in step (A) into SPT numbers.
 2. The method as claimed in claim 1, wherein step (A) is performed by compressing the dynamic range of filter coefficients by a transformation: ${{H^{\prime}(z)} = {{H(z)}\frac{\prod\limits_{i = 1}^{m - 1}\left( {1 + {\alpha_{i}z^{- \beta}}} \right)^{m}}{\prod\limits_{i = 1}^{m - 1}\left( {1 + {\alpha_{i}z^{- \beta}}} \right)^{m}}}},$

where parameters α and β are chosen depending on filter type, −1≦α_(i)≦1, and m denotes iteration numbers of coefficient operation of transformation.
 3. The method as claimed in claim 2, further comprising a step (C) for optimizing the coefficients by removing redundant SPT numbers.
 4. The method as claimed in claim 3, wherein, in step (C), the coefficients are optimized by a trellis de-allocation method comprising the steps of: (C1) allocating all SPT terms of the coefficients; (C2) at a k=K−1 stage, linking each SPT term to all other SPT terms in a K-th stage, wherein each link corresponds to an operation that determines the frequency response when the linked SPT term in the (K−1)-th stage is removed; (C3) repeating step (C2) until all the stages cannot satisfy filter specifications; and (C4) for each SPT term in the previous stage, selecting the link that results in a best frequency response and retaining the links that yield the filter specification as a surviving path, thereby determining the redundant SPT terms.
 5. The method as claimed in claim 1, wherein, in step (B), the compressed coefficients are quantized into SPT numbers as follows: ${h_{i} \cong {\sum\limits_{k = 1}^{L}{S_{k}2^{- k}}}} = {D_{n - 1}\cdots \quad D_{1}D_{0}}$

where h_(i) denotes a compressed coefficient, S_(k)∈{−1,0,1}, L is the number of nonzero digits and n is the coefficient wordlength.
 6. The method as claimed in claim 2, wherein |α_(i)|≦1, sign (α_(i))=−1, and β=1 if the filter is a low-pass filter.
 7. The method as claimed in claim 2, wherein |α_(i)|≦1, sign (α_(i))=1 and β=1 if the filter is a high-pass filter.
 8. The method as claimed in claim 2, wherein |α_(i)|≦1, sign (α_(i))=1 and β=π/ω_(c) if the filter is a band-pass filter and ω_(c) denotes the central frequency of the band-pass filter.
 9. The method as claimed in claim 2, wherein |α_(i)|≦1, sign (α_(i))=−1 and β=2 if the filter is a band-stop filter.
 10. The method as claimed in claim 2, wherein the value of each α_(i) can be assigned to be either equal or not equal to the others.